Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct lock ASSERTs in vdev_label_read/write #5983

Merged
merged 1 commit into from
Apr 21, 2017

Conversation

ofaaland
Copy link
Contributor

@ofaaland ofaaland commented Apr 7, 2017

Description

Both vdev_label_{read,write}() are changed to assert SCL_STATE is held
as RW_READER, instead of assertions which required more config locks be held.

Motivation and Context

The existing assertions in vdev_label_read() and vdev_label_write(),
testing which config locks are held, are incorrect. The assertions
test for locks which exceed what is required for safety.

This is safe because:

Changes to the vdev tree occur under SCL_ALL as RW_WRITER, via
spa_vdev_enter() and spa_vdev_exit().

Changes to vdev state occur under SCL_STATE_ALL as RW_WRITER, via
spa_vdev_state_enter() and spa_vdev_state_exit().

Therefore, the new assertions guarantee that the vdev cannot change
out from under a zio, and I/O to a specified leaf vdev's label is
safe.

Furthermore, this is consistent with the SPA locking discussion in
spa_misc.c, "For any zio operation that takes an explicit vdev_t
argument ... zio_read_phys(), or zio_write_phys() ... SCL_STATE as
reader suffices."

How Has This Been Tested?

Built ZFS with --enable-debug and ran zloop for about 2 hours.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@mention-bot
Copy link

@ofaaland, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @dpquigl and @grwilson to be potential reviewers.

@tuxoko
Copy link
Contributor

tuxoko commented Apr 7, 2017

I don't see how this would work.

vdev_read_label_config which calls vdev_label_read also has ASSERT(spa_config_held(spa, SCL_STATE_ALL, RW_WRITER) == SCL_STATE_ALL);

vdev_label_init which calls vdev_label_write also has ASSERT(spa_config_held(spa, SCL_ALL, RW_WRITER) == SCL_ALL);

@ofaaland
Copy link
Contributor Author

ofaaland commented Apr 9, 2017

@tuxoko , I'm not sure what the concern is, so if my answers below address the wrong things, let me know.

  1. SCL_STATE is one of the locks in SCL_STATE_ALL and in SCL_ALL. So those two functions you mention hold the lock I'm checking for, and others in addition. But they hold more locks than are necessary for the vdev_label_{read,write} calls to be safe.
  2. When a config lock is taken as writer, as I recall, spa_config_held(spa, lock, RW_READER) returns lock. I can find that code snippet again if that's your concern.

@behlendorf behlendorf requested a review from ahrens April 10, 2017 16:45
@tuxoko
Copy link
Contributor

tuxoko commented Apr 10, 2017

@ofaaland

When a config lock is taken as writer, as I recall, spa_config_held(spa, lock, RW_READER) returns lock.

I see, I didn't expect that.

@ofaaland
Copy link
Contributor Author

ofaaland commented Apr 10, 2017

@tuxoko @behlendorf ,

I double-checked spa_misc.c, and spa_config_enter() calls refcount_add(scl_count) for both RW_READER and RW_WRITER. spa_confi_held(RW_READER) then checks that !refcount_is_zero(&scl->scl_count).

However, perhaps I'm being too clever and should make the ASSERT check for either

(spa_config_held(RW_READER) || spa_config_held(RW_WRITER))

So it's clearer to someone reading it what is going on, without making them look at spa_config_* in detail.

Thoughts?

@behlendorf
Copy link
Contributor

The usage you've proposed in the existing PR already appears all over the code base so I don't think it will be too confusing. That said, I personally find your updated version easier to understand so let's go with that.

The existing assertions in vdev_label_read() and vdev_label_write(),
testing which config locks are held, are incorrect. The assertions
test for locks which exceed what is required for safety.

Both vdev_label_{read,write}() are changed to assert SCL_STATE is held
as RW_READER or RW_WRITER. This is safe because:

Changes to the vdev tree occur under SCL_ALL as RW_WRITER, via
spa_vdev_enter() and spa_vdev_exit().

Changes to vdev state occur under SCL_STATE_ALL as RW_WRITER, via
spa_vdev_state_enter() and spa_vdev_state_exit().

Therefore, the new assertions guarantee that the vdev cannot change
out from under a zio, and I/O to a specified leaf vdev's label is
safe.

Furthermore, this is consistent with the SPA locking discussion in
spa_misc.c, "For any zio operation that takes an explicit vdev_t
argument ... zio_read_phys(), or zio_write_phys() ... SCL_STATE as
reader suffices."

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
ASSERT(spa_config_held(zio->io_spa, SCL_ALL, RW_WRITER) == SCL_ALL ||
(spa_config_held(zio->io_spa, SCL_CONFIG | SCL_STATE, RW_READER) ==
(SCL_CONFIG | SCL_STATE) &&
dsl_pool_sync_context(spa_get_dsl(zio->io_spa))));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the dsl_pool_sync_context intentionally removed?

Copy link
Contributor Author

@ofaaland ofaaland Apr 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

dsl_pool_sync_context() == B_TRUE in two cases; one is that the caller is the sync thread or a thread performing a sync task. Based on all vdev state and vdev tree changes occurring under RW_WRITER, doing I/O is safe under only RW_READER regardless of which thread is doing it.

The other case, is that spa.spa_is_initializing == B_TRUE. That field is only true during calls to dsl_pool_create() and dsl_pool_open(), within spa_create() and spa_load_impl() respectively, both of which are setting up facilities that aren't necessary for reading from or writing to the labels on leaf vdevs.

I believe this was simply copied from the code protecting the dirty state or config lists, where the list is altered by the sync thread with only the read lock held. In the context of protecting those lists, the restriction makes sense - you can guarantee there is only one writer, since there is only one sync thread at the point where those lists are modified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuxoko ,
Do you approve of the patch now? Or do you have concerns I haven't addressed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't disapprove just have some questions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thank you

@behlendorf behlendorf merged commit 0091d66 into openzfs:master Apr 21, 2017
@ofaaland ofaaland deleted the b_vdev_label_read_locking branch April 24, 2017 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants